[parser.c] Optimize unescaping unicode by directly writing to the output buffer. by samyron · Pull Request #922 · ruby/json

samyron · 2026-01-03T04:00:54Z

This PR simplifies json_string_unescape by having convert_UTF32_to_UTF8 write directly to the output buffer. This saves a MEMCPY: because unescape_len isn't known at compile time, the copy isn't optimized away. At least that's what it seems based on the samply profile:

I'm pretty sure this doesn't introduce any potential out-of-bounds write: the current code in master unconditionally writes unescape_len bytes to buffer, whereas this branch passes buffer directly to convert_UTF32_to_UTF8, which writes the same unescape_len bytes to it.
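To illustrate the shape of the change, here is a minimal standalone sketch, not the actual parser.c code. The names `encode_utf8`, `append_via_scratch`, and `append_direct` are hypothetical stand-ins; only the idea (encode into a scratch array and copy, versus encode straight into the destination) comes from the description above. Both paths write exactly the same bytes, which is the basis of the out-of-bounds argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for convert_UTF32_to_UTF8: encodes code point `ch`
 * as UTF-8 into `buf` and returns the number of bytes written (1 to 4).
 * The caller must guarantee at least 4 writable bytes at `buf`. */
static int encode_utf8(char *buf, uint32_t ch)
{
    if (ch <= 0x7F) {
        buf[0] = (char)ch;
        return 1;
    }
    if (ch <= 0x7FF) {
        buf[0] = (char)(0xC0 | (ch >> 6));
        buf[1] = (char)(0x80 | (ch & 0x3F));
        return 2;
    }
    if (ch <= 0xFFFF) {
        buf[0] = (char)(0xE0 | (ch >> 12));
        buf[1] = (char)(0x80 | ((ch >> 6) & 0x3F));
        buf[2] = (char)(0x80 | (ch & 0x3F));
        return 3;
    }
    buf[0] = (char)(0xF0 | (ch >> 18));
    buf[1] = (char)(0x80 | ((ch >> 12) & 0x3F));
    buf[2] = (char)(0x80 | ((ch >> 6) & 0x3F));
    buf[3] = (char)(0x80 | (ch & 0x3F));
    return 4;
}

/* "Before" shape (assumed): encode into a scratch array, then memcpy.
 * The length is only known at run time, so the compiler generally
 * cannot fold the memcpy into fixed-size stores. */
static char *append_via_scratch(char *out, uint32_t ch)
{
    char scratch[4];
    int len = encode_utf8(scratch, ch);
    memcpy(out, scratch, (size_t)len);
    return out + len;
}

/* "After" shape (assumed): encode straight into the output buffer.
 * Exactly the same `len` bytes are written, without the extra copy. */
static char *append_direct(char *out, uint32_t ch)
{
    return out + encode_utf8(out, ch);
}
```

Both helpers return the advanced write cursor, so a caller can chain them while unescaping consecutive `\uXXXX` sequences.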

Before

After

Benchmarking on my M1 MacBook Air shows roughly a 2% improvement when parsing activitypub.json, though the exact figure varies from run to run.

== Parsing activitypub.json (58160 bytes)
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.059k i/100ms
Calculating -------------------------------------
               after     10.891k (± 1.0%) i/s   (91.82 μs/i) -     55.068k in   5.056759s

Comparison:
              before:    10694.4 i/s
               after:    10891.0 i/s - 1.02x  faster

Directly write to the output buffer when converting UTF32 to UTF8.

8718a33

byroot merged commit a51317c into ruby:master Jan 3, 2026
40 checks passed
